[MXNET-910] Multithreading inference. #12456
Conversation
Shouldn't we hide this abstraction from our users? The engine should be smart enough to determine when to multithread - the APIs just have to support concurrent calls.
This PR is mainly for demo.
force-pushed from 84cd54b to aaaa725
return EXIT_FAILURE;
}

std::string test_file(argv[1]);
int num_threads = std::atoi(argv[2]);
The example's interface changed. Should we add a default value for num_threads?
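If a default is wanted, a minimal sketch (hypothetical, not part of the PR) could make the argument optional:

```cpp
// Hypothetical: fall back to a single thread when argv[2] is omitted.
int num_threads = (argc > 2) ? std::atoi(argv[2]) : 1;
```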
include/mxnet/c_predict_api.h
Outdated
 * enough to keep `num_threads` predictors.
 * \return 0 when success, -1 when failure.
 */
MXNET_DLL int MXPredCreateMultithread(const char* symbol_json_str,
MultiThread?
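For context, the declaration under discussion mirrors MXPredCreate with an added thread count and an output array of handles. A sketch of what the full signature plausibly looks like (parameter list approximated from MXPredCreate, so treat it as illustrative rather than the final API):

```cpp
// Creates `num_threads` predictors that share one set of weights.
// Per the doc comment above, `out` must point to an array large
// enough to hold `num_threads` PredictorHandles.
MXNET_DLL int MXPredCreateMultithread(const char* symbol_json_str,
                                      const void* param_bytes,
                                      int param_size,
                                      int dev_type, int dev_id,
                                      mx_uint num_input_nodes,
                                      const char** input_keys,
                                      const mx_uint* input_shape_indptr,
                                      const mx_uint* input_shape_data,
                                      int num_threads,
                                      PredictorHandle* out);
```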
This reverts commit b9d844e.
force-pushed from 937e139 to c21ba35
ret->out_shapes = out_shapes;
ret->out_arrays = ret->exec->outputs();

if (!lazy) {
Why is this made lazy?
The fundamental problem here is that if we create multiple executors in the same thread (e.g., in the main thread), these executors will share the same temporary resources, which leads to race conditions when the executors are used in different threads. To fix this problem, we avoid creating executors when we create predictors in the main thread. The executors are actually created when the predictor is used in the worker thread for the first time. As long as the executor is always used in that worker thread, there won't be a race condition.
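A self-contained sketch of the pattern being described; Predictor and Executor here are illustrative stand-ins, not the PR's actual types:

```cpp
#include <memory>
#include <thread>
#include <vector>

// Stand-in for an executor whose temporary workspace must not be
// shared across threads.
struct Executor { /* owns thread-affine temporary resources */ };

struct Predictor {
  std::unique_ptr<Executor> exec;  // null until first use (lazy)

  void Forward() {
    // The executor is materialized in whichever thread calls Forward()
    // first, so its temporaries belong to that thread alone.
    if (!exec) exec.reset(new Executor());
    // ... run the graph with exec ...
  }
};

int main() {
  const int num_threads = 4;
  std::vector<Predictor> preds(num_threads);  // no executors bound yet
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back([&preds, i] {
      preds[i].Forward();  // executor is created in this worker thread
    });
  }
  for (auto& t : workers) t.join();
}
```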
If I create 10 different PredictorHandles with MXPredCreate() in the main thread, and 10 threads each call MXPredSetInput(), MXPredForward(), and MXPredGetOutput() on one of those handles, is that safe?
A thread can grab whichever PredictorHandle is currently available, so one thread may end up using different PredictorHandles over time. Is this safe?
You can give it a try. I'm not sure.
I got a deadlock in MXPredSetInput(), MXPredForward(), or MXPredGetOutput(). It seems this isn't supported.
* add multi-threading inference.
* demo multi-threading inference.
* add new capi.
* make naive engine thread local.
* create an executor inside each thread.
* fix format.
* fix format.
* fix format.
* Revert "make naive engine thread local." This reverts commit b9d844e.
* Update CAPI.
* add doc.
* fix lint.
* update example.
* update.
* fix.
* add check.
* fix.
* fix example.
* update name.
* update README.
Description
MXNet's Executor isn't thread-safe, so the predictor isn't thread-safe either. However, there is a use case where we want to run inference on the same model from multiple threads in parallel. In that case, we need to create multiple executors that share the same weight arrays. The current C predict API doesn't support this, so we add a new API that creates multiple predictors for parallel inference and demonstrate its use in the image-classification example.
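A condensed sketch of the intended usage, loosely based on the image-classification example (buffer setup and error handling omitted; symbol_json, param_bytes, input_shape_*, image, and out_size are placeholders):

```cpp
// One predictor per thread; each handle is used by exactly one thread.
std::vector<PredictorHandle> handles(num_threads);
const char* input_keys[1] = {"data"};
MXPredCreateMultithread(symbol_json.c_str(), param_bytes, param_size,
                        1 /* cpu */, 0 /* dev_id */, 1, input_keys,
                        input_shape_indptr, input_shape_data,
                        num_threads, handles.data());

std::vector<std::thread> workers;
for (int i = 0; i < num_threads; ++i) {
  workers.emplace_back([&, i] {
    std::vector<mx_float> out(out_size);  // per-thread output buffer
    MXPredSetInput(handles[i], "data", image.data(), image.size());
    MXPredForward(handles[i]);  // executor is bound here on first use
    MXPredGetOutput(handles[i], 0, out.data(), out.size());
  });
}
for (auto& t : workers) t.join();
for (int i = 0; i < num_threads; ++i) MXPredFree(handles[i]);
```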
Checklist
Essentials
Changes
Comments